NSF PAR Search | NSF Public Access Repository

A fast algorithm to factorize high-dimensional Tensor Product matrices used in Genetic Models

https://doi.org/10.1093/g3journal/jkae001

Lopez-Cruz, Marco; Pérez-Rodríguez, Paulino; de los Campos, Gustavo (January 2024, G3: Genes, Genomes, Genetics)
Lipka, Alexander (Ed.)
Abstract Many genetic models (including models for epistatic effects as well as genetic-by-environment interactions) involve covariance structures that are Hadamard products of lower rank matrices. Implementing these models requires factorizing large Hadamard product matrices. The available algorithms for factorization do not scale well for big data, making some of these models infeasible with large sample sizes. Here, based on properties of Hadamard products and (related) Kronecker products, we propose an algorithm that produces an approximate decomposition and is orders of magnitude faster than the standard eigenvalue decomposition. In this article, we describe the algorithm, show how it can be used to factorize large Hadamard product matrices, present benchmarks, and illustrate the use of the method by presenting an analysis of data from the northern testing locations of the G×E project from the Genomes-to-Fields Initiative (n∼60,000). We implemented the proposed algorithm in the open-source ‘tensorEVD’ R-package.
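The relationship between Hadamard and Kronecker products that the abstract alludes to can be illustrated with a small sketch. This is not the tensorEVD implementation (which is an R package); it is a minimal NumPy illustration of one well-known identity, under stated assumptions: each record i is mapped to a row of a small matrix K1 by a hypothetical index vector idx1 and to a row of K2 by idx2, so that the large matrix is the Hadamard product of the two expanded matrices. All names (random_psd, K1, K2, idx1, idx2, U, d) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n, rng):
    """Return a random symmetric positive semi-definite n x n matrix."""
    A = rng.standard_normal((n, n))
    return A @ A.T / n

# Assumed toy sizes: two small covariance matrices and n records.
n1, n2, n = 4, 3, 10
K1 = random_psd(n1, rng)          # e.g. a genetic relationship matrix
K2 = random_psd(n2, rng)          # e.g. an environmental similarity matrix
idx1 = rng.integers(0, n1, size=n)  # hypothetical: level of factor 1 per record
idx2 = rng.integers(0, n2, size=n)  # hypothetical: level of factor 2 per record

# Large Hadamard-product matrix, built directly (n x n).
K = K1[np.ix_(idx1, idx1)] * K2[np.ix_(idx2, idx2)]

# Eigen-decompose only the two small matrices.
d1, V1 = np.linalg.eigh(K1)
d2, V2 = np.linalg.eigh(K2)

# Row i of U is kron(V1[idx1[i], :], V2[idx2[i], :]); this selects rows of
# the Kronecker product V1 (x) V2 without ever forming it in full.
U = np.einsum("ia,ib->iab", V1[idx1], V2[idx2]).reshape(n, n1 * n2)
d = np.kron(d1, d2)               # eigenvalues of K1 (x) K2

# Exact factorization: K = U diag(d) U'. The columns of U are in general
# not orthonormal, so this is a factorization rather than a true EVD of K.
K_rebuilt = (U * d) @ U.T

# Keeping only the components with the largest d gives an approximation.
keep = np.argsort(d)[::-1][:6]
K_approx = (U[:, keep] * d[keep]) @ U[:, keep].T
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print("exact match:", np.allclose(K, K_rebuilt))
print("relative error with 6 components:", rel_err)
```

The design point is that only the small matrices are eigen-decomposed; the cost of the large n x n decomposition is avoided, at the price of a factor whose columns are not orthogonal and an approximation error when components are dropped.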
Full Text Available
Leveraging data from the Genomes-to-Fields Initiative to investigate genotype-by-environment interactions in maize in North America

https://doi.org/10.1038/s41467-023-42687-4

Lopez-Cruz, Marco; Aguate, Fernando M.; Washburn, Jacob D.; de Leon, Natalia; Kaeppler, Shawn M.; Lima, Dayane Cristina; Tan, Ruijuan; Thompson, Addie; De La Bretonne, Laurence Willard; de los Campos, Gustavo (October 2023, Nature Communications)

Abstract Genotype-by-environment (G×E) interactions can significantly affect crop performance and stability. Investigating G×E requires extensive data sets with diverse cultivars tested over multiple locations and years. The Genomes-to-Fields (G2F) Initiative has tested maize hybrids in more than 130 year-locations in North America since 2014. Here, we curate and expand this data set by generating environmental covariates (using a crop model) for each of the trials. The resulting data set includes DNA genotypes and environmental data linked to more than 70,000 phenotypic records of grain yield and flowering traits for more than 4000 hybrids. We show how this valuable data set can serve as a benchmark in agricultural modeling and prediction, paving the way for countless G×E investigations in maize. We use multivariate analyses to characterize the data set’s genetic and environmental structure, study the association of key environmental factors with traits, and provide benchmarks using genomic prediction models.
Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits

https://doi.org/10.1534/g3.119.400498

Azodi, Christina B.; Bolger, Emily; McCarren, Andrew; Roantree, Mark; de los Campos, Gustavo; Shiu, Shin-Han (November 2019, G3: Genes|Genomes|Genetics)

Full Text Available
Transcriptome-Based Prediction of Complex Traits in Maize

https://doi.org/10.1105/tpc.19.00332

Azodi, Christina B.; Pardo, Jeremy; VanBuren, Robert; de los Campos, Gustavo; Shiu, Shin-Han (October 2019, The Plant Cell)
